Abstract

We consider a dsDNA polymer in which the distribution of bases is random at the base pair
level but ordered at a length of 18 base pairs, and calculate its force-elongation behaviour in the
constant extension ensemble. The unzipping force F(y) vs. extension y is found to have a series
of maxima and minima. By changing base pairs at selected places in the molecule we calculate
the change in the F(y) curve and find that the change in the force is of the order of a few pN
and that the range of the effect, depending on the temperature, can spread over several base pairs.
We also discuss briefly how to calculate, in the constant force ensemble, a pause or a jump in
the extension-time curve from the knowledge of F(y).

PACS numbers: 87.14.Gg, 87.15.Aa, 64.70.-p 
I. INTRODUCTION

DNA is a giant double stranded linear polymer in which genetic information is stored
[3]. Its double stranded helical structure is stabilized by the hydrogen bonding between
complementary bases (A-T are linked with two hydrogen bonds and G-C by three hydrogen
bonds) and the stacking interactions of the base pair plateaux. The stacking interactions,
which impose a well defined distance between the bases and give high rigidity to the
polymer along its axis, depend on the genome sequence. The energy landscape of
a dsDNA polymer is therefore expected to depend on the arrangement of bases along the
two strands. Knowing this dependence is an important step in understanding the biological
functioning of DNA.

With the development of single molecule techniques it has now become
possible to probe the force-elongation characteristics of a double stranded DNA (dsDNA)
polymer and measure its response to an external force or torque in vitro at temperatures
where dsDNA is thermally stable. Such measurements give information about the energy
landscape of the molecule. Experiments have usually been performed either in the constant
extension or in the constant force ensemble. In the constant extension ensemble the
average force of unzipping is found to vary randomly about an average value as the extension
is increased [9], while in the constant force ensemble the unzipped length as a function of
time is found to show several pauses and long jumps.

A number of theoretical efforts have recently been made to understand various aspects of
dsDNA unzipping. It is shown that while a homogeneous dsDNA gains
considerable entropy by opening in response to the external force, so that its unzipping
is entropy driven, a heterogeneous dsDNA is believed to unzip primarily for energetic reasons.
Lubensky and Nelson have studied the force induced unzipping of a randomly
disordered dsDNA using a Hamiltonian which is coarse grained over many, but an unknown
number of, bases. Weeks et al. have used the model of Lubensky and Nelson [15] and
have calculated the pause point spectrum in the constant force ensemble of a λ phage DNA.

Our aim in this article is to use a Hamiltonian which describes interactions at the base pair
level and to show that the force-extension curve obtained in the constant extension ensemble
provides a more direct exploration of the underlying free energy landscape through the maxima
and minima of the force profile. The model described in Sec. II also allows us to calculate
the effect of a base pair mutation on the force-extension behaviour.

We consider a dsDNA polymer of N (= Mn) base pairs made by repeating M (M → ∞)
times an oligonucleotide of n base pairs. The oligonucleotide used here to construct the
dsDNA polymer has 18 base pairs, of which 9 are A-T (or T-A) and the other 9 are G-C (or C-G).
The arrangement of these base pairs in the oligonucleotide is shown below:

3' - AGTGACATACTCGGACGA - 5' 
5' - TCACTGTATGAGCCTGCT - 3' 

The dsDNA polymer constructed in this way is heterogeneous as it contains both A-T and
G-C base pairs. However, because of the repetition of the oligonucleotide, the distribution
of bases in the dsDNA polymer is not random but has a periodicity at the length of the
oligonucleotide. We therefore expect its properties to lie in between those of a homogeneous
and a randomly disordered [15] dsDNA polymer.
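The construction above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the strand is stored in the left-to-right order in which it is printed above.

```python
# Sketch: build the periodic sequence from the 18-bp oligonucleotide.
OLIGO_3TO5 = "AGTGACATACTCGGACGA"   # strand written 3' -> 5' in the text
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(strand):
    """Return the base-by-base complementary strand (same left-to-right order)."""
    return "".join(COMPLEMENT[base] for base in strand)


def polymer(m_repeats):
    """Repeat the oligonucleotide M times, giving N = M * n bases."""
    return OLIGO_3TO5 * m_repeats


def count_pairs(strand):
    """Return (number of A-T or T-A pairs, number of G-C or C-G pairs)."""
    n_at = sum(base in "AT" for base in strand)
    return n_at, len(strand) - n_at


print(complement_strand(OLIGO_3TO5))   # second strand of the oligonucleotide
print(count_pairs(OLIGO_3TO5))         # composition of one repeat unit
print(len(polymer(4)))                 # N for M = 4
```

The complement reproduces the second strand listed above, and the pair count confirms the 9/9 composition.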

II. THE MODEL AND ITS THERMODYNAMICS 

We represent the interactions in the polymer at the base pair level by the model Hamiltonian
of Peyrard and Bishop (PB) [16] which, though it ignores the helicoidal structure of
the dsDNA polymer, has enough detail to analyze mechanical behaviour at the scale of a few Å
relevant to molecular-biological events and can easily be extended to include the effect
of heterogeneity in the base pair sequence. The PB model for a heterogeneous DNA polymer
is written as,
$$H = \sum_{i=1}^{N} \left[ \frac{p_i^2}{2m} + V_i(y_i) + W(y_i, y_{i+1}) \right] \tag{1}$$



where m is the reduced mass of a base pair (taken to be the same for both A-T and G-C base
pairs and equal to 300 a.m.u.), $y_i$ denotes the stretching of the hydrogen bonds connecting
the two bases of the i-th pair and $p_i = m(dy_i/dt)$. The on-site potential $V_i(y_i)$, which
describes the interaction of the two bases of the i-th pair, is represented by the Morse potential

$$V_i(y_i) = D_i \left[ e^{-a_i y_i} - 1 \right]^2 \tag{2}$$

where $D_i$ measures the depth of the potential and $a_i$ its inverse range. Both $D_i$ and $a_i$
depend on whether the i-th base pair is A-T or G-C.

TABLE I: Values of $\Delta\rho_{i,i+1}$ for all possible combinations of two consecutive base pairs.
Two consecutive bases along one strand are shown; the other two bases of the quartet are
complementary to these. These values are found using the stacking energy data of Sponer et al. [18]
and taking $\rho$ = 5.0.

  Base quartet (3'-5')   $\Delta\rho_{i,i+1}$      Base quartet (3'-5')   $\Delta\rho_{i,i+1}$
  AA                     -0.10                     GA                     -0.10
  AT                     -0.77                     GT                     -0.14
  AG                     -0.53                     GG                     -0.34
  AC                     -0.14                     GC                     +2.39
  TA                     -0.53                     CA                     +0.36
  TT                     -0.10                     CT                     -0.53
  TG                     +0.36                     CG                     +0.48
  TC                     -0.10                     CC                     -0.34

The stacking interaction term of the PB model is modified and is written as

$$W(y_i, y_{i+1}) = \frac{1}{2} k \left[ 1 + \rho_{i,i+1}\, e^{-b(y_i + y_{i+1})} \right] (y_i - y_{i+1})^2 \tag{3}$$

where the force parameter k measures the stiffness of a single strand of the molecule and the
second term in the bracket represents the anharmonic term. The strength of the anharmonic
term is measured by $\rho_{i,i+1}$ and its inverse range by b. We allow the value of the parameter
$\rho_{i,i+1}$ $(= \rho + \Delta\rho_{i,i+1})$ to depend on the arrangement of the four bases in the
consecutive base pairs i and (i + 1). In our calculation we have taken $D_{AT}$ = 0.058 eV,
$D_{GC}$ = 0.087 eV, $a_{AT}$ = 4.2 Å$^{-1}$, $a_{GC}$ = 6.3 Å$^{-1}$, b = 0.35 Å$^{-1}$,
k = 0.02 eV Å$^{-2}$, $\rho$ = 5.0 and used the data of
stacking energies given by Sponer et al. [18] to estimate the value of $\Delta\rho_{i,i+1}$. We list in
Table I the values of $\Delta\rho_{i,i+1}$ for all possible combinations of two consecutive base pairs.
We treat the oligonucleotide whose repetition forms the dsDNA polymer as an effective base pair
and define its kernel as,

$$K(y_1, y_n) = \int dy_2 \cdots dy_{n-1}\, K(y_1, y_2) \cdots K(y_i, y_{i+1}) \cdots K(y_{n-1}, y_n) \tag{4}$$

where

$$K(y_i, y_{i+1}) = \left( \frac{\beta k}{2\pi} \right)^{1/2} \exp[-\beta H(y_i, y_{i+1})] \tag{5}$$

and

$$H(y_i, y_{i+1}) = \frac{1}{2} \left[ V(y_i) + V(y_{i+1}) \right] + W(y_i, y_{i+1})$$

with $\beta = 1/k_B T$.

Equation (4) is evaluated by the method of matrix multiplication. For this we chose
-5.0 Å and 200.0 Å, respectively, as the lower and upper limits of integration for each
variable and discretized the space using Gaussian quadrature with the number of grid points
equal to 900. The resulting matrix $K(y_1, y_n)$ is a 900 x 900 square matrix.
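A minimal sketch of this discretization is given below, assuming Gauss-Legendre nodes (the text says only "Gaussian quadrature") and the kernel prefactor $(\beta k/2\pi)^{1/2}$. All function names are hypothetical, and the oligonucleotide kernel is built literally as in Eq. (4), i.e. as a product of n - 1 pair kernels. The potential parameters and the $\Delta\rho$ values are those quoted in the text and Table I.

```python
import numpy as np

KB = 8.617333e-5                       # Boltzmann constant, eV/K
D = {"AT": 0.058, "GC": 0.087}         # Morse depth, eV
A_INV = {"AT": 4.2, "GC": 6.3}         # Morse inverse range, 1/Angstrom
B, K_STIFF, RHO = 0.35, 0.02, 5.0      # 1/Angstrom, eV/Angstrom^2, dimensionless
PAIR_TYPE = {"A": "AT", "T": "AT", "G": "GC", "C": "GC"}
DELTA_RHO = {                          # Table I, keyed by two consecutive bases
    "AA": -0.10, "AT": -0.77, "AG": -0.53, "AC": -0.14,
    "TA": -0.53, "TT": -0.10, "TG": +0.36, "TC": -0.10,
    "GA": -0.10, "GT": -0.14, "GG": -0.34, "GC": +2.39,
    "CA": +0.36, "CT": -0.53, "CG": +0.48, "CC": -0.34,
}
OLIGO = "AGTGACATACTCGGACGA"


def quadrature(n_grid, lo=-5.0, hi=200.0):
    """Gauss-Legendre nodes and weights mapped from [-1, 1] to [lo, hi]."""
    x, w = np.polynomial.legendre.leggauss(n_grid)
    return 0.5 * (hi - lo) * x + 0.5 * (hi + lo), 0.5 * (hi - lo) * w


def morse(y, base):
    """On-site Morse potential of Eq. (2) for the pair containing `base`."""
    kind = PAIR_TYPE[base]
    return D[kind] * (np.exp(-A_INV[kind] * y) - 1.0) ** 2


def pair_kernel(y, base_i, base_j, temperature):
    """Matrix K(y_i, y_{i+1}) of Eq. (5) on the grid y (prefactor assumed)."""
    beta = 1.0 / (KB * temperature)
    yi, yj = y[:, None], y[None, :]
    rho = RHO + DELTA_RHO[base_i + base_j]
    stack = 0.5 * K_STIFF * (1.0 + rho * np.exp(-B * (yi + yj))) * (yi - yj) ** 2
    h = 0.5 * (morse(yi, base_i) + morse(yj, base_j)) + stack
    return np.sqrt(beta * K_STIFF / (2.0 * np.pi)) * np.exp(-beta * h)


def oligo_kernel(temperature, n_grid=900):
    """Kernel of Eq. (4): quadrature-weighted product of the n - 1 pair kernels."""
    y, w = quadrature(n_grid)
    kern = pair_kernel(y, OLIGO[0], OLIGO[1], temperature)
    for base_i, base_j in zip(OLIGO[1:-1], OLIGO[2:]):
        # Integrating over the shared variable is a weighted matrix product.
        kern = kern @ (w[:, None] * pair_kernel(y, base_i, base_j, temperature))
    return y, w, kern


y, w, kern = oligo_kernel(temperature=300.0, n_grid=60)   # small grid for a quick run
print(kern.shape)
```

A 60-point grid is far too coarse for quantitative work; it is used here only so that the sketch runs quickly. Setting `n_grid=900` corresponds to the grid size quoted above.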
The configurational partition function $Z_N^c$, defined as

$$Z_N^c = \int \prod_{p=1}^{M} dy_p\, K(y_p, y_{p+1})\, \delta(y_1 - y_{N+1}) \tag{6}$$

has been evaluated by the matrix multiplication method for several values of N ranging
from 3000 to 6000. As all matrices in Eq. (6) are identical, the multiplication is done very
efficiently. The resulting partition function is used to calculate the free energy per base pair
from the following relation

$$f = -\frac{k_B T}{2} \ln\left( \frac{4\pi^2 k_B^2 T^2 m}{k} \right) - \frac{k_B T}{N} \ln Z_N^c \tag{7}$$

where the first term on the r.h.s. is due to the kinetic energy. We found that for N > 4000
the value of f is independent of the value of N taken in the calculation of $Z_N^c$.
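With the periodic boundary condition of Eq. (6), the partition function is the trace of the M-th power of a single matrix, so identical factors can be combined by repeated squaring. The text does not say which multiplication scheme was actually used; the following is one possible sketch, with a hypothetical function name and a 2 x 2 toy matrix standing in for the quadrature-weighted kernel.

```python
import numpy as np


def log_trace_power(matrix, power):
    """Return ln Tr(matrix**power) using repeated squaring with rescaling.

    After every product the matrix is divided by its largest entry and the
    logarithm of that factor is accumulated, so large powers do not overflow.
    """
    result = np.eye(matrix.shape[0])
    log_result = 0.0                    # true result = result * exp(log_result)
    base = np.array(matrix, dtype=float)
    log_base = 0.0                      # true base = base * exp(log_base)
    while power > 0:
        if power & 1:
            result = result @ base
            log_result += log_base
            scale = np.abs(result).max()
            result /= scale
            log_result += np.log(scale)
        base = base @ base
        log_base *= 2.0
        scale = np.abs(base).max()
        base /= scale
        log_base += np.log(scale)
        power >>= 1
    return log_result + np.log(np.trace(result))


toy = np.array([[2.0, 1.0], [1.0, 2.0]])    # eigenvalues 3 and 1
print(log_trace_power(toy, 64), 64 * np.log(3.0))
```

The number of matrix products grows only as the logarithm of the power. For large powers the result approaches the power times the logarithm of the largest eigenvalue, which is why the free energy per base pair becomes independent of N.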

We have also diagonalized the matrix $K(y_p, y_{p+1})$ to find the first two eigenvalues $\lambda_0$
and $\lambda_1$, where $\lambda_0 = \exp(-\beta E_0)$ and $\lambda_1 = \exp(-\beta E_1)$, using a method
described elsewhere. Since $E_0$ and $E_1$ are the eigenenergies of a kernel having n = 18 base pairs,
the average eigenenergy per base pair is $\epsilon_i = E_i/n$. In Fig. 1(a) we plot the eigenenergies
$\epsilon_0$ and $\epsilon_1$. We find that $\Delta\epsilon(T) = \epsilon_0 - \epsilon_1 \propto (T_D - T)^{\nu}$
with $\nu = 1$ and $T_D$ = 356.7 K. The behaviour of $\Delta\epsilon(T)$ as a
function of T is found to be the same as in the case of a homogeneous dsDNA polymer. In
the thermodynamic limit the value of the configurational partition function $Z_N^c$ is determined
by $\lambda_0$ and therefore $Z_N^c = \lambda_0^M = \exp(-\beta N \epsilon_0)$. The free energy per base
pair calculated using this value of $Z_N^c$ agrees very well with the values found from Eq. (7).
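The eigenvalue route can be sketched as follows. The discretized integral operator is the kernel matrix with its columns multiplied by the quadrature weights. The Gaussian toy kernel below is a hypothetical stand-in (it is not the PB kernel), chosen only so that the example is self-contained, and the function names are illustrative.

```python
import numpy as np

KB = 8.617333e-5   # Boltzmann constant, eV/K


def leading_eigenvalues(kernel, weights, count=2):
    """Largest eigenvalues of the operator phi -> integral K(y, y') phi(y') dy'."""
    values = np.linalg.eigvals(kernel * weights[None, :])
    return np.sort(values.real)[::-1][:count]


def eigenenergy_per_base_pair(eigenvalue, temperature, n_bp):
    """epsilon = E / n with eigenvalue = exp(-E / (k_B T))."""
    return -KB * temperature * np.log(eigenvalue) / n_bp


# Toy kernel: normalized Gaussian of unit width on the interval [-10, 10].
x, w = np.polynomial.legendre.leggauss(80)
y, w = 10.0 * x, 10.0 * w
toy = np.exp(-0.5 * (y[:, None] - y[None, :]) ** 2) / np.sqrt(2.0 * np.pi)

lam0, lam1 = leading_eigenvalues(toy, w)
eps0 = eigenenergy_per_base_pair(lam0, temperature=300.0, n_bp=1)
eps1 = eigenenergy_per_base_pair(lam1, temperature=300.0, n_bp=1)
print(lam0, lam1, eps0 - eps1)
```

For the oligonucleotide kernel one would pass the 900 x 900 matrix and its weights and set `n_bp=18`; the difference of the two eigenenergies then plays the role of $\Delta\epsilon(T)$.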

The value of f as a function of temperature T is shown in Fig. 1(b). A cusp in f at
the thermal denaturation temperature $T_D$ = 356.7 K is clearly seen. The existence of the cusp
indicates that the thermal denaturation transition is first order, with a sudden jump in its
entropy.

FIG. 1: (a) The average per base pair eigenenergies $\epsilon_0$ and $\epsilon_1$ of the kernel of Eq. (4)
as a function of temperature. (b) The free energy per base pair as a function of temperature.
A cusp in the free energy is seen at the thermal denaturation temperature T = $T_D$ = 356.7 K,
where $\Delta\epsilon = \epsilon_0 - \epsilon_1 \simeq 0$.

Next we calculate the unzipping of the polymer in the constant extension ensemble, in
which the separation of one end of the two strands of the molecule is kept fixed and the average
force needed to keep this separation is measured. The work done in stretching base pair
1 to a distance y apart is

$$W(y) = \frac{1}{2} V_1(y) - k_B T \left[ \ln Z_N^c(y) - \ln Z_N^c \right] \tag{8}$$

where 

$$Z_N^c(y) = \int \prod_{p=1}^{M} dy_p\, \delta(y_1 - y)\, \delta(y_N - 0)\, K(y_p, y_{p+1}) \tag{9}$$
and $Z_N^c$ is defined by Eq. (6). The force F(y) as a function of extension y is found from the
relation,

$$F(y) = \frac{\partial W(y)}{\partial y} \tag{10}$$
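Once W(y) has been tabulated on a grid of extensions, the force follows from a numerical derivative. The sketch below assumes W is stored in eV and y in Å, and converts the result to pN; the linear W(y) is a made-up test input, not a result of the model.

```python
import numpy as np

EV_PER_ANGSTROM_IN_PN = 1602.18   # 1 eV/Angstrom = 1.60218e-9 N


def force_from_work(y, work_ev):
    """F(y) = dW/dy by finite differences; y in Angstrom, W in eV, F in pN."""
    return np.gradient(work_ev, y) * EV_PER_ANGSTROM_IN_PN


y = np.linspace(0.0, 10.0, 201)
work = 0.01 * y                   # toy input: constant slope of 0.01 eV/Angstrom
force = force_from_work(y, work)
print(force[0], force[-1])
```

`np.gradient` accepts a non-uniform coordinate array, so a finer grid can be used near the small-y barrier and a coarser one at large y.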

III. RESULTS AND DISCUSSIONS 

In Fig. 2 we plot the value of F(y) as a function of extension y for T = 100 K and 300
K. To show the height and width of the peak in F(y) at small values of y as well as the oscillations
at larger values of y, we chose different scales on the two sides of y = 5.0 Å. Though
experiments are generally done at temperatures close to 300 K, the motivation for studying
the behaviour of dsDNA at 100 K is to illustrate the effect of temperature on the energy
landscape. Figure 2 shows a large force barrier at y ≈ 0.2 Å, a feature similar to that found
in the case of a homogeneous dsDNA polymer. The height of this peak is nearly
230 pN at 100 K and 215 pN at 300 K. The physical origin of this barrier is the potential
well due to hydrogen bonding plus the additional barrier associated with the reduction in
DNA strand rigidity as one passes from dsDNA to ssDNA. Since the peak corresponds to
a process in which only one or two base pairs participate, the effect of thermal energy on the
formation of the barrier is small.
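For orientation, the hydrogen-bond part of this barrier can be estimated from the bare Morse potential of Eq. (2) alone, whose slope is largest where $e^{-ay} = 1/2$, i.e. at $y = \ln 2 / a$ with value $Da/2$. The sketch below checks this numerically with the parameters quoted in Sec. II. It ignores the factor 1/2 in Eq. (8), the stacking term and all entropic contributions, so it is not expected to reproduce the computed peak heights.

```python
import numpy as np

EV_PER_ANGSTROM_IN_PN = 1602.18   # 1 eV/Angstrom = 1.60218e-9 N
MORSE = {"AT": (0.058, 4.2), "GC": (0.087, 6.3)}   # (D in eV, a in 1/Angstrom)


def morse_force(y, depth, a):
    """dV/dy for V = D (exp(-a y) - 1)^2, in eV/Angstrom."""
    e = np.exp(-a * y)
    return 2.0 * depth * a * e * (1.0 - e)


y = np.linspace(0.0, 2.0, 200001)
for kind, (depth, a) in MORSE.items():
    force = morse_force(y, depth, a)
    i_max = np.argmax(force)
    print(kind, y[i_max], force[i_max] * EV_PER_ANGSTROM_IN_PN,
          np.log(2.0) / a, 0.5 * depth * a * EV_PER_ANGSTROM_IN_PN)
```

For the A-T parameters the maximum slope is about 195 pN at y ≈ 0.17 Å, and for the G-C parameters about 439 pN at y ≈ 0.11 Å, so the position of the bare Morse maximum is of the same order as the barrier position quoted above.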

For y > 1.0 Å we find, however, that the two curves of F(y) differ considerably from each
other as well as from that of a homogeneous DNA polymer. In the case of a homopolymer
the peak in F(y) decays as y is increased and for y > 10.0 Å attains a constant value equal
to the critical force found from the constant force ensemble to unzip the dsDNA
into two ssDNA, whereas here we find that the F(y) curve oscillates about a mean value. These
oscillations are due to maxima and minima in the energy landscape, and these maxima and
minima depend on the genome sequence in the two strands of the DNA polymer. A G-C rich
region of the polymer has an energy minimum whereas an A-T rich
region has an energy maximum. Therefore, when the front of the unzipping fork enters a G-C
rich region it needs a larger force to come out of it, whereas in an A-T rich region it
needs less force than the average to move forward. As is evident from Fig. 2, the maxima
and minima in the energy landscape are much more pronounced at 100 K than
at 300 K; thermal fluctuations tend to suppress the local variation and make
the energy landscape relatively smooth.
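A rough way to see where the G-C rich and A-T rich stretches of the oligonucleotide lie is a sliding-window G-C fraction over the periodically repeated sequence. The window width of 5 base pairs used below is an arbitrary choice made for illustration, not a quantity taken from the model, and the function name is hypothetical.

```python
OLIGO = "AGTGACATACTCGGACGA"


def gc_fraction_profile(seq, window):
    """G-C fraction in a sliding window over the periodically repeated sequence."""
    extended = seq + seq[:window - 1]
    return [sum(base in "GC" for base in extended[i:i + window]) / window
            for i in range(len(seq))]


profile = gc_fraction_profile(OLIGO, window=5)
for position, fraction in enumerate(profile, start=1):
    print(position, OLIGO[position - 1], f"{fraction:.2f}")
```

Windows with a fraction above 0.5 mark stretches where, by the argument above, a larger than average force would be expected, and windows below 0.5 mark stretches where a smaller force would suffice.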
FIG. 2: The average force F(y) in pN at T = 100 K and 300 K required to stretch the base pair
at one end to a distance y. Different scales are chosen for the two sides of y = 5.0 Å.



At T = 300 K the oscillations in F(y) are found to decay (see Fig. 2) rather rapidly, and
for extensions greater than 100 Å the dsDNA polymer seems to behave like a homopolymer
in which unzipping takes place continuously at a constant rate when the applied force exceeds
the critical force. But at T = 100 K the oscillations in F(y) persist for much larger values
of y. As the extension y increases, however, the wiggles in F(y) get smoothed and the
peak heights decrease, though very slowly. These features arise from the contribution made
to the free energy by the fluctuations of the single stranded part of the unzipping fork. As
y increases the single stranded part grows, resulting in larger entropic contributions to
the free energy and thus reducing the barrier encountered by the front of the fork. Beyond a
certain length of the ssDNA part the entropic contribution to the free energy per base pair
saturates, and oscillations in F(y), if not already suppressed, remain unaffected on further
increasing the extension. Therefore the effect of genome sequence on the unzipping depends
on the depth of the local energy minimum. In the case of the dsDNA polymer considered
here the variations in the energy along the polymer get averaged out at 300 K, and therefore
the unzipping beyond about 100 base pairs becomes similar to that of a homopolymer. But
at 100 K the local minima in the energy landscape are deep enough to show variations in
the unzipping force for large extensions.
FIG. 3: The change in F(y) when a base pair C-G at position 10 is replaced by a base pair T-A
(dotted line) and when a base pair A-T at position 15 is replaced by a base pair G-C (dashed
line), shown at temperatures 100 K and 300 K. The full line corresponds to the original dsDNA
polymer. The change is due to the combined effect of the change in the on-site and the stacking
potentials.



To examine more closely the sensitivity of the force-extension curve to the genome
sequence, we altered base pairs at selected positions and calculated the effect on the F(y)
curve. We label a base pair by its number counted from the stretched end, taking the
base pair that is being stretched as 1. First we alter a base pair at one position and calculate
its effect. In Fig. 3 we show our results for two cases: (i) a base pair C-G at position 10 is
replaced by a base pair T-A (dotted line) and (ii) a base pair A-T at position 15
is replaced by a base pair G-C (dashed line). These replacements change
both the on-site potential V(y) and the stacking interactions. The change in
the stacking interactions is measured by the change in $\Delta\rho$, which for base pairs 9-10
changes from -0.14 to -0.77 and for base pairs 10-11 from -0.53 to -0.10. When the
base pair at position 15 is changed from A-T to G-C, the value of $\Delta\rho$ changes
for base pairs 14-15 from -0.10 to -0.34 and for base pairs 15-16 from -0.14 to +2.39. The
change in the F(y) curve brought about by the change in the base pair sequence is therefore due
to the combined effect of the change in the on-site and the stacking potentials. From Fig. 3
we note that the change in F(y) is of the order of 5-7 pN and that the range of the effect
spreads over about 14-17 Å. This means that while the effect is localized to about 7-8 base pairs
at 100 K, at 300 K it spreads to almost the length of the oligonucleotide.

FIG. 4: The change in the value of F(y) when a base pair is replaced in each repeating
oligonucleotide, at T = 100 K and 300 K. The dotted line corresponds to the change introduced at
positions 10, 28, 46, ... by replacing the C-G base pair by a T-A base pair. Similarly, the dashed
line corresponds to the situation when the A-T base pair at positions 15, 33, 51, ... is replaced
by a G-C base pair. The full line corresponds to the original dsDNA polymer.

In Fig. 4 we compare the results found by replacing a base pair in each repeating
oligonucleotide at the same position, i.e. a periodic change in the base pair sequence with the
period equal to the length of the oligonucleotide. For example, the change introduced at
position 10 now repeats along the polymer at positions 10, 28, 46, 64, ... and the change
made at position 15 now repeats along the polymer at positions 15, 33, 51, 69, ... While
the change introduced in the F(y) curve by this change in genome sequence follows the
periodicity of the change at 100 K, at 300 K the entire curve moves either up or down by
about 1 pN for y > 50 Å.
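The stacking changes quoted for the two replacements, and the positions at which the periodic replacements occur, can be cross-checked against Table I with a short script. This is only a bookkeeping sketch with hypothetical function names; positions are 1-indexed from the stretched end and refer to the strand written 3'-5' in Sec. I.

```python
OLIGO = "AGTGACATACTCGGACGA"
DELTA_RHO = {                          # Table I, keyed by two consecutive bases
    "AA": -0.10, "AT": -0.77, "AG": -0.53, "AC": -0.14,
    "TA": -0.53, "TT": -0.10, "TG": +0.36, "TC": -0.10,
    "GA": -0.10, "GT": -0.14, "GG": -0.34, "GC": +2.39,
    "CA": +0.36, "CT": -0.53, "CG": +0.48, "CC": -0.34,
}


def mutate(seq, position, new_base):
    """Return seq with the base at the 1-indexed position replaced."""
    return seq[:position - 1] + new_base + seq[position:]


def stacking_changes(seq, position, new_base):
    """Old and new Delta-rho for the two steps adjacent to an interior position."""
    new_seq = mutate(seq, position, new_base)
    changes = {}
    for left in (position - 1, position):          # steps (p-1, p) and (p, p+1)
        old = DELTA_RHO[seq[left - 1:left + 1]]
        new = DELTA_RHO[new_seq[left - 1:left + 1]]
        changes[(left, left + 1)] = (old, new)
    return changes


def periodic_positions(position, count, period=len(OLIGO)):
    """Positions at which a per-repeat replacement recurs along the polymer."""
    return [position + period * j for j in range(count)]


print(stacking_changes(OLIGO, 10, "T"))   # C-G -> T-A at position 10
print(stacking_changes(OLIGO, 15, "G"))   # A-T -> G-C at position 15
print(periodic_positions(10, 4), periodic_positions(15, 4))
```

The lookup uses only the two bases along one strand, as in Table I, and handles interior positions only; a replacement at position 1 or 18 of the repeated polymer would also involve the neighbouring repeat unit.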



IV. CONCLUSIONS 

We have modified the stacking energy part of the PB Hamiltonian so that the effect
of the genome sequence in a dsDNA molecule is accounted for at the base pair level. Using the
method of matrix multiplication we have carried out essentially exact thermodynamics of this
Hamiltonian in the constant extension ensemble. The results given above suggest that the
genome sequence has a very specific effect on the unzipping of a dsDNA polymer and that the
force-extension curve determined in the constant extension ensemble can therefore be used to
find this sequence. Any change in the sequence can change the force-extension
curve over a length of a few base pairs. Such a change in the unzipping process may have an
important effect on DNA transcription and replication dynamics.

The above results can also be used to calculate the unzipping properties of the molecule
in the constant force ensemble. For example, the constant force ensemble partition function
can be found from the relation,

$$Z_N^c(F) = \int dy\, Z_N^c(y)\, e^{\beta F y} = Z_N^c \int dy\, e^{-\beta [W(y) - F y]} \tag{11}$$

and the time needed to cross a force peak (or valley) encountered by the front of the unzipping
fork from the relation

(12)

where $F(y_1) - F = F(y_2) - F = 0$ and $t_0$ is the time needed by the front to move the distance
$\Delta y = y_2 - y_1$ under the influence of the force F when there is no peak or valley. A
natural extension of the method developed here is to apply it to estimate the force-elongation
behaviour of a natural DNA.

Experimental results both in the constant extension and in the constant force ensemble are
available for lambda phage DNA, which is 48,502 base pairs long. This DNA is particularly
interesting as it consists of a GC rich half connected to an AT rich half and is therefore expected
to have a different energy landscape when viewed from the two ends. Though for quantitative
comparisons between the experimental results and the theory one has to consider the genome
sequence of the DNA from the end used in the experiment, the qualitative features of the
experimental results are in agreement with the features of the force-extension curve discussed
above. It may also be pointed out that while the method described above gives the force
behaviour at the base pair level, the results found experimentally are still at a level of several
base pairs.